\chapter[Measuring figures of merit]{Measuring density matrix figures of merit without
  measuring the density matrix}
\begin{quote}
How little is purity known in the world. How little we value it. What
little care we take to preserve it; what little zeal we have in asking
for it, since we cannot have it of ourselves.\\
--St. John Vianney
\end{quote}

\section{Introduction}
While quantum state tomography has proven to be a very useful
tool in experimentally characterizing quantum systems, it has a
very serious limitation in that its complexity grows
exponentially in the number of degrees of freedom in the
system. For a system of $N$ qubits, a minimum of $4^N$ measurements
must be made in order to determine the density matrix.  For
ten qubits this is just over a million measurements.   Similarly the difficulty
of reconstructing the density matrix from the measurements grows
exponentially in the number of qubits, resulting in the need for
greater classical computational resources.  This
exponential complexity scaling in the measurement
and calculation of density matrices was most pronounced in the
tomographic characterizations of eight qubit states of trapped
ions\cite{Haffner2004}.  There, twenty-four hours of measurement
were required to characterize the state, followed by several
weeks of maximum-likelihood fitting on a computer cluster.  The
same experimental apparatus was capable of making states of up
to 12 qubits, but the difficulty of performing quantum state
tomography led the group to limit the size of the state
to eight qubits.

This poor scaling of quantum state tomography with system size
does not bode well for its potential as a tool in characterizing
quantum computers, which, in order to be useful for tasks like
factoring, will require at
least thousands and possibly millions of qubits.  Nor is it a
particularly surprising characteristic of quantum states.
To characterize a classical probability distribution we also
need to make a number of measurements that scales with the
number of possible states.  The number of possible states in turn scales exponentially in the number of
bits.  Despite this, very complex classical systems are made and
characterized, not by trying to map out the complete
probability distribution of all possible states of the system,
but by defining figures of merit by which the system can be
evaluated.  For example, an engineered system might be characterized by its
failure rate, or a physical system by thermodynamic variables
like temperature and entropy.  In order to extend our current
success in quantum state engineering to devices that can hope to
be useful, a similar approach must be taken with quantum states.

Indeed, there are some figures of merit that are already
commonplace in the quantum information world.  Some of them,
like the tangle, concurrence and negativity measures arose
because of the desire to develop locally invariant descriptions
for entanglement.  Others, notably the purity and the von
Neumann entropy, try to quantify the width of the density
matrix viewed as a probability distribution over states.    

These figures of merit used to describe states have typically
been measured by first doing quantum state tomography to determine the state
and then calculating the figure of merit from the density
matrix.  This approach is necessary because generally the
figures of merit of interest are non-linear functionals\footnote{A functional is a function
  defined on the space of operators: given an operator, a
  functional returns a number.} of the density matrix.  This
means that they cannot be expressed as an expectation value of
an operator acting on the space of quantum states because any
such operator ${\bf A}$ will necessarily be linear in the
density matrix: $\left< {\bf A}\right>=\text{Tr}\left[\rho {\bf
    A}\right]$.  Since they cannot be measured directly, the
only recourse has been to completely characterize the state in
terms of the $4^N$ density matrix elements and then calculate
the figure of merit from these.  Clearly this approach will
suffer from the same scaling problem as quantum state tomography
itself, a resource requirement that scales exponentially in the
size of the state.

In this chapter we present the implementation of a solution to
this problem proposed by Todd Brun\cite{Brun2004} that will work whenever the
figure of merit is polynomial in the density matrix.  This is an
important class of figures of merit, which includes the
purity and other unitary invariants such as Kempe's invariant
and the 3-tangle\cite{Coffman2000,Kempe1999}, as well as the Q-measure of Meyer and
Wallach\cite{Meyer2002}.  Other quantities that are analytic in the density
matrix such as the von Neumann entropy can be approximated by a
truncated Taylor series, while the concurrence\cite{Wootters1998},
entanglement of formation\cite{Bennett1996_2}, entanglement of
distillation\cite{Bennett1996}, and the
negativity of the partial transpose\cite{MikeandIke} can be approximately
restated as simple functions of such truncated Taylor series.

In particular we will apply Brun's technique to measure the purity of a one-qubit
density matrix as a `single-shot' measurement, without
measuring the density matrix itself.  In so doing, we will
investigate different techniques for making impure polarization
states and find that they are not all equivalent: they do not all
yield the same measured purity under this technique, even
when they are described by the same density matrix.  This difference
arises from the effect of the `environment' whose
entanglement with the system gives rise to the impurity.

Related experimental work on directly measuring the purity has
been done by Du et al.\cite{Du2005} in NMR systems and by Bovino et
al.\cite{Bovi2005} in a four-photon system.  Work on measuring another
figure of merit, the concurrence\cite{Wootters1998}, was
published\cite{Walb2006} while this experiment was in progress.

\section{Multiqubit measurements}
Brun's method relies on the insight that $m^\text{th}$-order
polynomial functionals of the density matrix can be thought of
as linear functionals of a density matrix consisting of $m$ or more
copies of the state.

If the state of a quantum system is described by a density
matrix $\rho$,
\begin{equation}
\rho=\sum^{d-1}_{i,j=0}\rho_{ij}\ket{i}\bra{j},
\end{equation}
then Brun defines a polynomial functional of degree $m$ as a
function
\begin{equation}
f(\rho)=\sum_{i_1,j_1,\ldots,i_m,j_m}c_{i_1 j_1\ldots i_m j_m}
  \rho_{i_1 j_1}\rho_{i_2 j_2}\cdots\rho_{i_m j_m},
\label{fofrho}
\end{equation}
where the $c_{i_1 j_1 \ldots i_m j_m}$ are arbitrary complex
constants.

Now suppose that we have available $m$ copies of a system in
state $\rho$.  The density matrix for the joint state of the
system will simply be the tensor product of $\rho$ with itself $m$
times.
\begin{equation}
\rho^{\otimes m}=\sum_{i_1, j_1, \ldots, i_m, j_m}\rho_{i_1
  j_1}\cdots \rho_{i_m
    j_m}\ket{i_1}\bra{j_1}\otimes\cdots\otimes\ket{i_m}\bra{j_m}
\label{rhootimesm}.
\end{equation}

Each term in $f(\rho)$ corresponds to a term of equation
\ref{rhootimesm}.  If we define the operator
\begin{equation}
\hat{A}_{i_1j_1\ldots i_m j_m}=\ket{j_1}\bra{i_1}\otimes\cdots\otimes\ket{j_m}\bra{i_m},
\end{equation}
then calculating its expectation value for the
state $\rho^{\otimes m}$ gives
\begin{equation}
\text{Tr}\left\{ \hat{A}_{i_1j_1\ldots i_m j_m}\rho^{\otimes m}\right\}=\rho_{i_1 j_1}\cdots\rho_{i_m j_m},
\end{equation}
which is one of the terms making up $f(\rho)$.  Since $f(\rho)$
is linear in such terms, we can always construct an operator
$\hat{B}$ as a linear combination of operators $\hat{A}$ such
that $f(\rho)$ is the expectation value of $\hat{B}$\footnote{You
  may be wondering about terms in $f(\rho)$ of order less than
  $m$.  These can be accounted for by rewriting them as sums of
  terms of order $m$ and using the property that $\sum_i
  \rho_{ii}=1$ to eliminate the unwanted terms.}.
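The correspondence between monomials in $f(\rho)$ and multi-copy expectation values is easy to verify numerically.  The following Python sketch (an illustration added here, assuming only NumPy and the conventions above) checks, for the $m=2$ case, that each product $\rho_{i_1 j_1}\rho_{i_2 j_2}$ equals the expectation value of the corresponding operator $\hat{A}_{i_1 j_1 i_2 j_2}$:

```python
import numpy as np

rng = np.random.default_rng(0)

# A random single-qubit density matrix (positive semi-definite, trace one).
M = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
rho = M @ M.conj().T
rho /= np.trace(rho)

def ketbra(j, i, d=2):
    """The operator |j><i| in a d-dimensional space."""
    op = np.zeros((d, d), dtype=complex)
    op[j, i] = 1.0
    return op

rho2 = np.kron(rho, rho)  # two copies, rho (x) rho

# Check Tr{ (|j1><i1| (x) |j2><i2|) rho^{(x)2} } = rho_{i1 j1} * rho_{i2 j2}
for i1, j1, i2, j2 in [(0, 0, 1, 1), (0, 1, 1, 0), (1, 0, 0, 1)]:
    A = np.kron(ketbra(j1, i1), ketbra(j2, i2))
    assert np.isclose(np.trace(A @ rho2), rho[i1, j1] * rho[i2, j2])
```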

We can apply this technique to a particular figure of merit
of interest, the purity, defined as 
\begin{equation}
P=\text{Tr}\left\{ \rho^2\right\}.
\end{equation}
This figure of merit provides a measure of the width of the
distribution $\rho$ over the pure quantum states.  It is
unitarily invariant and has proven a useful measure of
entanglement\cite{MikeandIke}, classicality\cite{Ghose2007} and even
effective temperature\cite{Anwar2004}.  In a
$D$-dimensional Hilbert space it is equal to
one for a pure state and to $1/D$ for a maximally mixed state.
States that are neither pure nor maximally mixed have
intermediate values of the purity.

Treating the purity as a function of the density matrix
elements, if we have a single-qubit density matrix 
\begin{equation}
\rho=\left(
\begin{array}{cc}
\rho_{00} &\rho_{01}\\
\rho_{10} &\rho_{11}
\end{array}
\right),
\end{equation}
then the purity is 
\begin{equation}
P=\rho_{00}^2+\rho_{01}\rho_{10}+\rho_{10}\rho_{01}+\rho_{11}^2.
\end{equation}
Following Brun's procedure we can define a corresponding
two-qubit operator ($m=2$ here) $\hat{A}$ whose expectation value will be the purity:  
\begin{equation}
\hat{A}=\ket{0}\bra{0}\otimes\ket{0}\bra{0}+\ket{0}\bra{1}\otimes\ket{1}\bra{0}+\ket{1}\bra{0}\otimes\ket{0}\bra{1}+\ket{1}\bra{1}\otimes\ket{1}\bra{1}.
\end{equation}
On the face of it this operator looks like a rather complicated
one to measure.  We can simplify it by using the identity
operator $\mathbb{I}_4$
\begin{equation}
\mathbb{I}_4=\ket{0}\bra{0}\otimes\ket{0}\bra{0}+\ket{0}\bra{0}\otimes\ket{1}\bra{1}+\ket{1}\bra{1}\otimes\ket{0}\bra{0}+\ket{1}\bra{1}\otimes\ket{1}\bra{1},
\end{equation}
so that
\begin{align}
\hat{A}&=\mathbb{I}_4-\ket{01}\bra{01}-\ket{10}\bra{10}+\ket{01}\bra{10}+\ket{10}\bra{01}\notag\\
&=\mathbb{I}_4-2\ket{\psi^{-}}\bra{\psi^{-}},
\end{align}
where we have introduced the two-qubit singlet-state
$\ket{\psi^-}=\frac{1}{\sqrt{2}}\left(\ket{01}-\ket{10}\right)$.
  Writing out the expectation value of $\hat{A}$ we get
\begin{align}
P&=\text{Tr}\left\{\left(\mathbb{I}_4-2\ket{\psi^-}\bra{\psi^-}\right)\rho^{\otimes
    2}\right\} \notag\\
&=1-2\bra{\psi^-}\rho^{\otimes 2}\ket{\psi^-}.
\label{purityformula}
\end{align}
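Equation \ref{purityformula} can be checked numerically for arbitrary single-qubit states.  The short Python sketch below (an added illustration using NumPy; the computational-basis ordering $00,01,10,11$ is assumed) draws random density matrices and compares $\text{Tr}\{\rho^2\}$ with $1-2\bra{\psi^-}\rho^{\otimes 2}\ket{\psi^-}$:

```python
import numpy as np

rng = np.random.default_rng(1)

def random_qubit_dm():
    """A random single-qubit density matrix: positive semi-definite, trace 1."""
    M = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
    rho = M @ M.conj().T
    return rho / np.trace(rho)

# Singlet |psi-> = (|01> - |10>)/sqrt(2) in the basis ordering 00, 01, 10, 11.
psi_minus = np.array([0, 1, -1, 0], dtype=complex) / np.sqrt(2)

for _ in range(100):
    rho = random_qubit_dm()
    purity_direct = np.trace(rho @ rho).real
    singlet_proj = (psi_minus.conj() @ np.kron(rho, rho) @ psi_minus).real
    assert np.isclose(purity_direct, 1 - 2 * singlet_proj)
```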
Thus a single two-qubit expectation value measurement, namely
the expectation value of the singlet-state projector, is
sufficient to measure the purity.  In contrast it would have taken
four linearly independent expectation values to perform a full
characterization of the single-qubit state.  This four-fold
reduction is perhaps not especially impressive, particularly since
the single-particle measurements required for tomography will
generally be easier to obtain than the two-particle joint
measurement.  The real advantage of Brun's method, though, is that it
completely changes the scaling.  While for tomography the number
of measurements scales exponentially in the system size, for
Brun's method the complexity is set by the degree of the
polynomial in the density matrix functional.  A million-qubit
quantum computer, which would be inconceivable to characterize
using tomography, could in principle have its purity measured by
Brun's technique: run two copies of the computer through a
calculation and then apply a joint measurement to the two
copies at the end.  This is a truly scalable approach to state
characterization, and one that will become increasingly
important as quantum machines grow beyond a few qubits.

That the purity in the single-qubit case is so closely related
to the  singlet-state
projection is not so surprising.  As has been discussed in
previous chapters, the singlet state measurement is an
anti-symmetry measurement in every basis.  It assumes the form
$\frac{1}{\sqrt{2}}\left(\ket{\psi\bar{\psi}}-\ket{\bar{\psi}\psi}\right)$
in all bases, where $\ket{\bar{\psi}}$ is a single-qubit state orthogonal to
the single-qubit state $\ket{\psi}$.  This means that a state of the form
$\ket{\psi}\ket{\psi}$ will have a singlet projection of zero.
Thus when $\rho$ represents a pure state we recover from equation
\ref{purityformula} the correct value of $1$ for the purity.  

When $\rho$ is the maximally mixed state,
$\frac{1}{2}\mathbb{I}_2$, the purity is $1/2$.  In this case
$\rho^{\otimes 2}=\frac{1}{4}\mathbb{I}_4$.  Since the
singlet-state is one of the four Bell states and the maximally
mixed state is rotationally invariant (meaning it looks the same
in the Bell basis as in any other basis), its singlet state
projection is $1/4$ and equation \ref{purityformula} gives the
correct value of $1/2$ for the purity.

One way to think of this is to imagine the density matrix as
literally describing a box of qubits prepared in one of two
orthogonal states with equal probability.  Two qubits at a time
are selected from the box and the singlet state projection of
the joint state is measured.  There is a $50\%$ chance that the
two selected qubits will be in the same state, in which case
the singlet state projection is zero.  There is also a $50\%$
chance that the two qubits will be in orthogonal states, in
which case the singlet state projection is $1/2$.  On average,
then, the singlet state projection is $0.5\times0+0.5\times
1/2=1/4$.  Similar reasoning can be applied to states that are
mixed but not maximally mixed, by imagining a box filled with
states orthogonal in the basis in which the density matrix
$\rho$ is diagonal.  If one state has probability $p$ and the
other probability $(1-p)$ then the probability of selecting two
of the same state is $p^2+(1-p)^2$ while the probability of
selecting two orthogonal states is $2p(1-p)$.  The former case
results in zero singlet-state projection and the latter a
projection of $0.5$.  It follows from formula
\ref{purityformula} that the purity is $P=1-2p(1-p)=p^2+(1-p)^2$
which is what you would get by squaring and taking the trace of
a diagonal density matrix with entries $p$ and $(1-p)$ along the diagonal.
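The box-of-qubits picture lends itself to a quick Monte Carlo check (an added sketch; the sample size is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)

def sampled_purity(p, n_pairs=200_000):
    """Monte Carlo version of the box-of-qubits picture: draw pairs from a box
    of qubits prepared in one of two orthogonal states (probabilities p and
    1-p).  An orthogonal pair has singlet projection 1/2, an identical pair 0;
    the purity is then 1 - 2 * (average singlet projection)."""
    a = rng.random(n_pairs) < p
    b = rng.random(n_pairs) < p
    mean_singlet = np.where(a != b, 0.5, 0.0).mean()
    return 1 - 2 * mean_singlet

# Agrees with the closed-form result p^2 + (1-p)^2 to sampling precision.
for p in (0.5, 0.7, 1.0):
    assert abs(sampled_purity(p) - (p**2 + (1 - p)**2)) < 0.01
```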

The problem of measuring the purity without measuring the
density matrix therefore reduces to the problem of measuring the
singlet state projection, which, as we've seen in previous
chapters, can be implemented using two-photon interference\cite{Hong1987}.

\section{Experimental techniques}
\subsection{Singlet state projection}

\begin{figure}
  \centerline{
    \mbox{\includegraphics[width=\textwidth]{Figures/BrunApparatus.eps}}
    }
  \caption{Experimental setup for (a) the direct purity measurement and (b) quantum state
  tomography.
  Labels designate a 50/50 beamsplitter (BS), a non-linear $\beta$-Barium
  Borate (BBO) crystal, half-waveplates (HWP), liquid-crystal variable waveplates (LCWP),
  single photon counting modules (SPCM) and a polarizing beamsplitter (PBS).
  A type-I spontaneous parametric downconversion (SPDC) crystal produces pairs of H-polarized
  photons.  In (a), the same state $\rho$ is prepared in both arms
  and the beamsplitter acts as a singlet state filter.  In (b), a
  state is prepared in one of the arms and polarimetry is used to
  measure the density matrix of the state using a quarter-waveplate, a half-waveplate and a polarizing beamsplitter.}
  \label{brunapparatus}
  \end{figure}

As in the mutually-unbiased basis tomography experiment, we
perform joint measurements by using two-photon interference.
Since the purity measurement requires only a singlet state
projection, there is no need to
rotate the measurement to other bases or even to use a polarizer
(in fact, using a polarizer would break the unitary invariance
of the measurement and we would no longer be measuring purity!).  We simply set up the
beamsplitter as shown in Figure \ref{brunapparatus}(a), prepare
the two input photons to have the same density matrix, and
measure the rate of coincidence detections.  

Since the visibility of the two-photon interference is not
$100\%$, we need a way to relate the purity not just to the
singlet state projection, but to the measured projection with
the limited visibility.  We will make the useful
assumption that the factors that limit the visibility, such as
imperfect mode-matching, an imperfect splitting ratio and so on, are not
themselves polarization-sensitive.  Under this assumption the
measured projector can be written as a singlet state projector
plus a term proportional to identity that carries no
polarization information but adds a background to the
measurement.   If the maximum visibility possible given these
limiting factors were $90\%$, then the projector 
\begin{equation}
{\hat{P}}_\text{actual}=0.10 \mathbb{I}_4+0.80 \ket{\psi^-}\bra{\psi^-}
\end{equation}  
satisfies the necessary conditions that it goes to 0.90 when the
state is pure singlet, to 0.10 when the state is symmetric and to
0.50 when the state contains equal symmetric and antisymmetric
components such as for $\ket{HV}$.  Physically we can interpret
the first term as representative of the fact that with imperfect
visibility all states have some likelihood of producing a
coincidence.  

Using this formula, the purity can be written as
\begin{equation}
P=1-2\frac{\left<{\hat{P}}_\text{actual}\right>-0.1}{0.80}.
\end{equation}
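In code, the inversion from the measured projection to the purity is a one-liner.  The sketch below is an added illustration; the generalization of the $90\%$ case to an arbitrary maximum visibility $V$, namely $(1-V)\,\mathbb{I}_4+(2V-1)\ket{\psi^-}\bra{\psi^-}$, is an extrapolation consistent with the coefficients quoted above:

```python
def purity_from_projection(p_meas, visibility=0.90):
    """Convert a measured (background-contaminated) singlet projection into a
    purity.  Assumes polarization-insensitive visibility loss, so that
    P_actual = (1-V)*Identity + (2V-1)*|psi-><psi-|; V = 0.90 reproduces the
    0.10 and 0.80 coefficients quoted in the text."""
    background = 1 - visibility
    scale = 2 * visibility - 1
    return 1 - 2 * (p_meas - background) / scale

# Pure state: the measured projection equals the 0.10 background, purity 1.
assert abs(purity_from_projection(0.10) - 1.0) < 1e-12
# Maximally mixed: projection 0.10 + 0.80*(1/4) = 0.30, purity 1/2.
assert abs(purity_from_projection(0.30) - 0.5) < 1e-12
```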
This adaptation of the problem to the real world of
imperfect visibilities introduces a new issue.  It is now
possible that for a finite number of detector clicks,
the value of the measured purity will be greater than $1$ due to
statistical noise in the estimation of the expectation value.
This is not as bad as it seems.  The whole analysis has assumed
that we can measure expectation values whereas in reality
we can only ever estimate them.  While it is true that if the
two-photon interference visibility were perfect then it would be
impossible to observe a purity greater than one, this is more of
a fluke than a qualitative difference.  It arises because a pure
state has a singlet state projection of zero and the variance in a
Poisson distribution with a mean of zero is zero.
Even with perfect singlet state projections it is still possible to
get unphysical purity
measurements at the \emph{lower} end of the purity scale.  The minimum
purity for a qubit is $1/2$, which occurs when the state is
maximally mixed; but then the singlet state projection is $1/4$
and carries statistical noise of its own, so it is certainly
possible for formula \ref{purityformula} to spit out unphysical
values of the purity when the singlet state bracket refers
to \emph{estimates} of the expectation value rather than the true
expectation value.

One cannot get unphysical values of the purity from a density
matrix, but in that case physicality has been added in as an
assumption by requiring that the density matrix be positive
semi-definite.  One could do the same with these direct
approaches to measuring purity by applying maximum-likelihood
fitting.  Such an approach would rephrase the problem as what
value of purity on the interval $[1/2,1]$
was most likely to have given the observed measurement.  This
was not done in the experiment, but in hindsight that probably
would have been a good idea. 
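Such a maximum-likelihood estimate is straightforward to sketch.  The following Python code is an added illustration, not what was done in the experiment: it assumes each photon pair independently yields a coincidence with probability equal to the visibility-limited singlet projection, and maximizes the binomial likelihood over the physical interval $[1/2,1]$:

```python
import numpy as np

def ml_purity(n_coinc, n_pairs, visibility=0.90):
    """Maximum-likelihood purity restricted to the physical interval [1/2, 1].
    Model (an assumption): each pair produces a coincidence with probability
    q(P) = (1-V) + (2V-1)*(1-P)/2, the visibility-limited singlet projection,
    and the observed counts are binomial."""
    P_grid = np.linspace(0.5, 1.0, 5001)
    q = (1 - visibility) + (2 * visibility - 1) * (1 - P_grid) / 2
    log_likelihood = n_coinc * np.log(q) + (n_pairs - n_coinc) * np.log(1 - q)
    return P_grid[np.argmax(log_likelihood)]

# A raw projection above 0.30 (the maximally mixed value) would invert to an
# unphysical purity below 1/2; the ML estimate clips it to the boundary.
assert ml_purity(350, 1000) == 0.5
# Likewise a raw projection below the 0.10 background clips to purity 1.
assert ml_purity(80, 1000) == 1.0
```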

Figure \ref{brunapparatus}(b) shows how we measure the single-photon
polarization density matrix for one of the photons.  It is a
textbook polarization analyzer, with a QWP, HWP and polarizer
in front of the detector.  Projections onto
$\left\{\ket{H},\ket{V},\ket{D},\ket{A},\ket{R},\ket{L}\right\}$
were measured and the single-qubit density matrix was found by
maximum likelihood fitting.  In the experiment we measured the
single-qubit expectation values from the singles rate at the
detector, not the coincidence rate.  We measured the total
intensity of light reaching the detector, then turned off the
pump laser and measured the background due to residual
room lights and detector dark counts.  The difference between
these two detection rates was used in the tomographic reconstruction.
A better approach would have been to use a scissor jack to remove the beamsplitter from
the system and then rely on coincidence detection with an
analyzer on the side to be characterized and no polarizer on the
other side.  This approach would have been background-free, and
much more elegant.  Unfortunately it didn't occur to me at the
time.  Luckily, the density matrix is made up of expectation
values, not of higher moments in the statistical distribution of
counts, so as long as the background rates were stable (which
they were over an hour or so), the density matrices produced by
measuring singles were indistinguishable from those taken using coincidences.

\subsection{Making impure states}
Unique among candidate quantum information systems, photon
polarization states, by their nature, are angelically pure.  As
Andrew White is fond of saying, the fact that a residual
polarization can be measured in the cosmic microwave background
indicates that photon coherence times are on the order of
$10^{17}$ s.  In non-photonic systems impurity arises due to
interactions with the `environment', the other quantum systems
to which the system under study is coupled.  For photons,
propagation in free space induces no decoherence.  Propagation
through linear optical materials induces unitary, rather than
non-unitary, evolution.  In an experiment like this, where we
\emph{want} to study photon impurity, what options do we have to
obtain it?

A quantum-information approach is to use quantum randomness to
make an impure state.  One could, for example, create a
Bell-state of two photons and then send one of them into an
absorbing medium.  The photons start out in a maximally
entangled state, and the absorption of one photon creates a
maximally entangled state of the other photon and the electron
spins in the absorbing medium.  These quickly couple to
everything else in the medium and we have a situation that looks
very much like environment-induced decoherence in that we
couldn't possibly retrieve the correlation information if we
wanted to, so we can treat the first photon as being impure.
This approach has been used in a few experiments\cite{Bovi2005}, but it
requires having at least as many photon pairs as you need impure
photons at the end.  For the Brun scheme we would need two
photon pairs, and we were not set up to create them at the time
of the experiment.

Another technique, used, for instance, by Wei et al.\cite{Wei2005}, is to use the photons' other
degrees of freedom as the `environment' and then simply not
detect those other degrees of freedom.  So, for example, if the
photon is sent through a birefringent medium whose birefringent
delay is greater than its coherence time, then after propagating
through the medium, the extraordinary and ordinary polarization components of
the photon will have essentially random relative phases.  If an equal
superposition of extraordinary and ordinary polarizations is
sent in (say $\ket{D}$, with $H$ and $V$ as the ordinary and extraordinary
crystal axis directions), what emerges will be an entangled
state of delay and polarization,
$\frac{1}{\sqrt{2}}\left(\ket{H,E}+\ket{V,L}\right)$, where $E$ and
$L$ are early and late time bins with a separation larger than the
coherence time.  If the photon detectors are insensitive to the
time delay, as they will be in typical SPDC
experiments, then they effectively trace over the time
information and we are left with a mixture of $\ket{H}$ and
$\ket{V}$.  One can also think about this decoherence process in the
frequency domain where the relative group delay between
$\ket{H}$ and $\ket{V}$ can be thought of as a phase shift that
depends linearly on frequency.  If the detector can collect a
wide range of frequencies then it collects photons with a wide
range of birefringent phase shifts for the different frequencies
in the light.  This will appear the same as if the photons had
been given random phase shifts.
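This frequency-domain picture can be made quantitative with a short numerical model (added here; the Gaussian spectrum and the centre wavelength of $702$ nm are assumptions, chosen to be consistent with the 10 nm filters described later in this chapter).  For an input $\ket{D}$ photon the surviving off-diagonal coherence is the spectrum-averaged phase factor, and the purity follows directly:

```python
import numpy as np

def purity_after_delay(delay_um, center_um=0.702, fwhm_um=0.010):
    """Purity of an input |D> photon after an H/V group delay, tracing over
    frequency.  A Gaussian spectrum is assumed (centre wavelength and FWHM
    are assumptions).  The surviving coherence is
    gamma = <exp(2*pi*i*delay/lambda)>, and the purity is (1+|gamma|^2)/2."""
    sigma = fwhm_um / (2 * np.sqrt(2 * np.log(2)))  # FWHM -> Gaussian sigma
    lam = np.linspace(center_um - 5 * sigma, center_um + 5 * sigma, 20001)
    weights = np.exp(-0.5 * ((lam - center_um) / sigma) ** 2)
    weights /= weights.sum()
    gamma = np.sum(weights * np.exp(2j * np.pi * delay_um / lam))
    return 0.5 * (1 + abs(gamma) ** 2)

assert purity_after_delay(0.0) > 0.999              # no delay: state stays pure
assert abs(purity_after_delay(237.0) - 0.5) < 0.01  # delay >> coherence length
```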

A third way of generating impurity is to directly apply random
unitaries to the photons.  For example, one could split the
photon amplitudes for $H$ and $V$ at a PBS and put a heat source
in one arm to cause random changes in the index of refraction of
air before recombining them at a second PBS\cite{Mohseni2003}.  Or one
could, as
we did in this experiment, hook a liquid crystal waveplate up to
a pseudo-random number generator and produce well-defined phase
shifts that depend on the random number generated.
On the face of it, these sources of impurity seem somewhat
contrived as compared to the first two, but they are not
fundamentally any different.  If impurity is generated by the
loss of a single photon from a Bell pair, then the information about
its polarization is still contained \emph{somewhere}.  If it gets
absorbed then it transfers its polarization state to the
absorbing material and, in principle, one could find the
electron whose spin was flipped and
use the Bell state correlation to determine the polarization of
the first photon.  Similarly, a sufficiently fast or spectrally
sensitive detector could, in principle, resolve the frequency-dependent
phase shift induced by a birefringent group delay.  As far as Nature is
concerned, in all three cases impurity is created because we
\emph{choose} not to look at information that is, in principle, available
to be measured.

Of these three possible methods of creating impure photon states
we tried applying Brun's method to the latter two.  For the
birefringent group-delay-induced impurity we used AR-coated
pieces of quartz of varying thickness.  The light was prepared
in $\ket{D}$ and the quartz was aligned to have its optical axis
vertical or horizontal.  After the quartz, the light was
depolarized to a varying degree that depended on quartz
thickness.  The thickest piece was $25$ mm long, enough
to induce a group delay of 237 $\mu$m between the two
polarizations, far more than the coherence length of 49
$\mu$m set by the 10 nm bandwidth of the interference
filters used.  It was therefore possible to produce maximally mixed
states with these quartz pieces.
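The quoted coherence length follows from the familiar estimate $l_c\sim\lambda^2/\Delta\lambda$.  The centre wavelength used below (702 nm) is an assumption, chosen because it reproduces the quoted 49 $\mu$m; the text states only the bandwidth and the result:

```python
# Coherence length from the filter bandwidth: l_c ~ lambda^2 / (delta lambda).
lam, dlam = 702e-9, 10e-9     # centre wavelength (assumed) and 10 nm bandwidth
l_c = lam**2 / dlam
print(f"coherence length ~ {l_c * 1e6:.0f} um")   # ~ 49 um, matching the text
```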

For the impurity created by the liquid-crystal waveplates, two
different methods were used.  In the first method the waveplates
were calibrated so that by applying known voltages, known phase
shifts of $0$, $\pi$, and $\pi/2$ radians could be obtained
between the $H$ and $V$ polarizations.  A software pseudo-random number
generator selected between these possibilities and the phase
shift was applied.  In the second method, the function of phase
shift versus voltage was mapped out for phase shifts between $0$
and $\pi$ and used to create a probability distribution that
produced uniformly distributed phase shifts on this interval.  A
pseudo-random number generator selected voltages from this
interval at random.  

\section{Results}
\subsection{Density matrices}
The density matrices for several polarization states measured
using the apparatus in figure
\ref{brunapparatus}(b) are shown in figure \ref{brundms}.  The
pure states $\ket{H}$ and $\ket{D}$ have measured purities of
$0.99\pm0.01$, and the various mixed states, which were all
generated using the LCWP and pseudo-random number generator, 
have purities that match to within $0.02$ the expectations based on the
distribution of states that make them up.  For the random state
selection the uncertainty had a contribution from the statistics
of the state selection process as well as the photon counting
statistics.
\begin{figure}
\includegraphics[width=\columnwidth]{Figures/dms.eps}
\caption{Experimental density matrices for various single-photon polarization
  states.
  (a) pure horizontal $\ket{H}$
  (b) pure diagonal $\ket{+}$
  (c) equal mixture of $\ket{+}$ and $\ket{-}$
  (d) equal mixture of $\ket{+}$, $\ket{-}$ and $\ket{R}=(\ket{H}-i
  \ket{V})/\sqrt{2}$
  (e) A mixture of states of the form $(\ket{H}+e^{i \phi}\ket{V})/\sqrt{2}$ with $\phi$ distributed
  equally over $\left[0, \pi\right]$}
\label{brundms}
\end{figure}
The states examined were $\ket{H}$, $\ket{D}$ obtained by
rotating $\ket{H}$ with a half-waveplate, an equal mixture of
$\ket{D}$ and $\ket{A}$ obtained by randomly applying either $0$
or $\pi$ birefringent phase shifts to $\ket{D}$, a partially
mixed state obtained by applying a phase shift of $0$, $\pi/2$
or $\pi$ to $\ket{D}$, and finally, the state obtained when
random phase shifts selected uniformly over the interval $0$ to $\pi$ were
applied to $\ket{D}$.  

The purities measured indirectly using quantum state
tomography are compared to the direct purity measurement in
table \ref{brunpuritycomparison}.  The theoretical purities
can be calculated by averaging density matrices over the
probability distribution of the applied phases.  The
calculation is left as an exercise for the reader.
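For completeness, the exercise can also be sketched numerically (an added illustration using NumPy): average the density matrices over each preparation distribution and take $\text{Tr}\{\rho^2\}$, which reproduces the theoretical values in table \ref{brunpuritycomparison}:

```python
import numpy as np

H = np.array([1, 0], dtype=complex)
V = np.array([0, 1], dtype=complex)
dm = lambda psi: np.outer(psi, psi.conj())          # |psi><psi|
purity = lambda rho: np.trace(rho @ rho).real       # Tr{rho^2}

plus, minus = (H + V) / np.sqrt(2), (H - V) / np.sqrt(2)
R = (H - 1j * V) / np.sqrt(2)

rho_a = (dm(plus) + dm(minus)) / 2                  # equal |+>, |-> mixture
rho_b = (dm(plus) + dm(minus) + dm(R)) / 3          # equal |+>, |->, |R> mixture
phis = np.linspace(0, np.pi, 10001)                 # phi uniform on [0, pi]
rho_c = np.mean([dm((H + np.exp(1j * p) * V) / np.sqrt(2)) for p in phis], axis=0)

assert np.isclose(purity(rho_a), 0.5)
assert np.isclose(purity(rho_b), 5 / 9)
assert np.isclose(purity(rho_c), 0.5 + 2 / np.pi**2, atol=1e-4)
```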

\begin{table}[!t]
\begin{tabular}
[c]{|c|ccc|}%
\hline
&&{\bf Purities}&\\
{\bf State} &{\bf Direct}&{\bf Tomographic}&{\bf Theoretical}\\
\hline\hline
$\ket{H}$&$1.00\pm0.03$&$1.00\pm0.01$&$1$\\
\hline
$\ket{+}$&$0.99\pm0.03$&$0.98\pm0.01$&$1$\\
\hline
Equal mixture &
\multirow{2}{*}{$0.52\pm0.01$} & \multirow{2}{*}{$0.50\pm0.01$} & \multirow{2}{*}{$0.5$}\\
$\ket{+}$, $\ket{-}$ & & & \\
\hline
Equal mixture & \multirow{2}{*}{$0.568\pm0.008$} &\multirow{2}{*}{$0.56\pm0.01$}&\multirow{2}{*}{$5/9\approx0.5556$}\\
$\ket{+}$, $\ket{-}$, $\ket{R}$ & & & \\
\hline
$\ket{H}+e^{i\phi}\ket{V}$, &\multirow{2}{*}{$0.72\pm0.01$}&\multirow{2}{*}{$0.70\pm0.01$}&\multirow{2}{*}{$0.5+\frac{2}{\pi^2}\approx0.7026$}\\
$\phi \in \left[0,\pi\right]$ & & & \\
 \hline
\end{tabular}
\centering \caption{The purities measured for five states using the
direct joint purity measurement and a full characterization followed
by a calculation of $\text{Tr}\{\rho^2\}$.  The stated errors arise
from counting statistics and the statistics associated with the
random state selection.} \label{results}
\label{brunpuritycomparison}
\end{table}

It can be seen that the purities as measured by the two methods
are the same to within error, indicating that Brun's method does
indeed work on these states.  What is perhaps more surprising
are the cases where it does \emph{not} work.  The rest of this
section will be devoted to examining those cases.

The results in table \ref{brunpuritycomparison} are for impure
states generated with liquid crystal waveplates.  What happens
when the impurity is instead generated by applying a
birefringent group delay so as to entangle the photons' time
degrees of freedom with their polarization?  First, if we are
going to create impurity in this way then we have to decide whether
to apply a shift in the same direction to the two photons (say
delaying H relative to V for both), or in opposite directions.
Both actions will give the same single-qubit density matrix for
the individual photons, but could potentially affect the
two-photon interference, perhaps in such a way that it can no
longer be properly interpreted as the purity.  In fact, doing it
\emph{either} way affects the two-photon interference profoundly and
makes it impossible to interpret as a polarization singlet-state
measurement!  If the two crystals are aligned in the same
direction, then the $H$ amplitude is delayed for both the
photons.  The $V$ components of the state always arrive at the beamsplitter
first and the $H$ components arrive last.  Crucially, the $H$
components arrive together and the $V$ components arrive
together.  The $H$ components will interfere perfectly and the
$V$ components will interfere perfectly and one will record a
maximal drop in the coincidence detection rate at the detectors,
exactly as if the single-photon states were pure!  Things get
even worse if the crystal axes have opposite alignments.  In that
case, the $H$ amplitude will be delayed on one side and the $V$
amplitude will be delayed on the other.  The undelayed $H$ amplitude
will always arrive at the same time as the undelayed $V$ amplitude
and the delayed $H$ amplitude will arrive at the same time as the
delayed $V$ amplitude.  In either arrival window the two photons are
completely distinguishable in polarization and no two-photon interference
occurs.  The singlet state projection in this case is $0.5$ (since
the singlet state is $50\%$ $\ket{HV}$ and $50\%$ $\ket{VH}$).  By
formula \ref{purityformula} this would constitute a purity of $0$
which is impossible.  
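Both failure modes can be reproduced in a toy numerical model.  The sketch below is illustrative only: the encoding of each photon as a polarization$\,\otimes\,$arrival-time vector, the textbook overlap formula $P_{\mathrm{coinc}} = \frac{1}{2}\left(1 - |\langle\psi_A|\psi_B\rangle|^2\right)$ for pure product inputs, and all variable names are our assumptions, not details of the apparatus.

```python
import numpy as np

# Basis vectors: polarization (H, V) and arrival time (early, late).
H, V = np.array([1.0, 0.0]), np.array([0.0, 1.0])
early, late = np.array([1.0, 0.0]), np.array([0.0, 1.0])

def photon(h_time, v_time):
    """Single photon whose H amplitude arrives at h_time, V at v_time."""
    return (np.kron(H, h_time) + np.kron(V, v_time)) / np.sqrt(2)

def coincidence(psi_a, psi_b):
    """HOM coincidence probability for pure product inputs:
    (1 - |<psi_a|psi_b>|^2) / 2."""
    return (1 - abs(np.vdot(psi_a, psi_b)) ** 2) / 2

def inferred_purity(p_singlet):
    """Purity inferred from the singlet projection: Tr rho^2 = 1 - 2p."""
    return 1 - 2 * p_singlet

# Crystals aligned the same way: H delayed on BOTH sides.
same = coincidence(photon(late, early), photon(late, early))
# Crystals aligned oppositely: H delayed on one side, V on the other.
opp = coincidence(photon(late, early), photon(early, late))

print(inferred_purity(same))  # 1.0 -- looks perfectly pure
print(inferred_purity(opp))   # 0.0 -- below the qubit minimum of 1/2
```

The same-delay case reproduces the full dip of a pure state, while the opposite-delay case gives no interference at all and hence the impossible purity of $0$.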

To get a sense of what is going on we can put a delay in one of
the arms and plot the coincidence rate as a function of that
delay.  In practice we need to do this anyway to figure out
where the two-photon interference is.  Figure \ref{brunHOMdips}(a) shows
what happens when the delays go in the same direction.  As one
photon wavepacket slides past the other in time there will be a
moment when the early $H$ packet overlaps with the late $V$
packet, but this does not cause interference because these two
states are orthogonal.  As the packets keep
sliding past each other, a point is reached when the two $H$
packets overlap and the two $V$ packets overlap and the maximal $90\%$
visibility is achieved.  Finally, on the way out, the late $V$
packet will overlap the early $H$ packet, but again there is no
interference.  Contrast that with the situation when the two
delays are in opposite directions.  Then as the arm is scanned a
point is reached when the early $H$ amplitude overlaps with the
late $H$ amplitude from the other photon.  These two amplitudes
interfere perfectly, but each photon is only $H$ $50\%$ of the
time.  The joint probability for both photons to be $H$ is $1/4$,
and the other $3/4$ of the time there is no interference.  A
similar thing happens when the late $V$ amplitude overlaps the
early $V$ amplitude.  Figure \ref{brunHOMdips}(b) shows this
situation, which results in two distinct dips, each with a $25\%$
visibility.


\begin{figure}[!t]
  \centerline{
    \mbox{\includegraphics[width=\columnwidth]{Figures/hom3.eps}}
  }
  \caption{(a) The two-photon interference dip when the two crystals
    introduce the same delay at the two beamsplitter inputs.
  (b) The two-photon interference dip when the two crystals
    introduce opposite delays at the two beamsplitter inputs.}
  \label{brunHOMdips}
  \end{figure}


So why does two-photon interference no longer measure the
impurity properly when the impurity is created by entanglement
between different photon degrees of freedom?  The answer is that
two-photon interference is not just a polarization interference
effect, but rather an interference effect that
depends on the other photon degrees of freedom, in this case the
timing information.  While the detectors are not sensitive to
the timing information, the interference \emph{is}, and this
means that the strength of the interference can no
longer be used as a polarization-singlet state filter.
Interestingly, if we were to somehow randomize the action of the
quartz crystals by sometimes advancing the $H$ polarization and
sometimes the $V$, then the two sides would have the same delay or
opposite delays with equal probability.
The two-photon interference visibility would be the average of
the visibilities for the two relative orientations, namely $1$
and $0$, and so we would recover the correct result: a $50\%$
visibility, a singlet-state projection of $0.25$, and a
corresponding purity of $1/2$.  The point is that this randomness
must be stored in a degree of freedom of some system other than
the photon in order for the two-photon interference to act
properly as a singlet-state filter.  
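A quick check of the arithmetic, using the relation between dip visibility and singlet projection implicit above (our reading of the setup is $P_{\mathrm{singlet}} = \frac{1}{2}(1-V)$, so that by formula \ref{purityformula} the purity simply equals the visibility):
\begin{align*}
  \bar{V} &= \tfrac{1}{2}\left(V_{\text{same}} + V_{\text{opp}}\right)
           = \tfrac{1}{2}(1 + 0) = \tfrac{1}{2},\\
  P_{\mathrm{singlet}} &= \tfrac{1}{2}\left(1 - \bar{V}\right) = \tfrac{1}{4},
  \qquad \mathrm{Tr}\,\rho^2 = 1 - 2P_{\mathrm{singlet}} = \tfrac{1}{2},
\end{align*}
which is the correct purity for a maximally mixed qubit.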

One can also think of this effect as being due to correlations
in the birefringent phases of the two photons.  Returning to the model
of the box full of photons,
we assumed at first that the two photons were selected at random
from the box.  Applying the same group delay to both inputs of
the beamsplitter is the same as applying the same linearly
dependent phase-shift with frequency with the same slope to the
two photons.  If somehow our picking of supposedly
mixed photons from the box always resulted in photons with the same
phase-frequency profile, then really we'd be in the case of a
pure state and would expect perfect two-photon interference.
Similarly if whenever we picked two photons they
always had oppositely sloped phase-frequency profiles we would
expect them to never interfere.  Using random LCWP phase shifts
we can create a similar kind of situation by making the phase
shift the same for both photons, even though this shared phase
shift is varied randomly from photon pair to
photon pair.  When this is done, the single-qubit density matrix
is measured to be maximally mixed, but the interference is
consistent with a perfectly pure state.  This can be seen in the
two-photon interference dip and measured density matrix in
figure \ref{brunsamephaseshift}.
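This correlated-phase pathology is easy to reproduce numerically.  In the hypothetical sketch below, the uniform phase distribution and the pure-state overlap formula are modelling assumptions: each pair shares one random phase $\phi$, so the shot-averaged single-photon density matrix comes out maximally mixed even though every individual shot interferes perfectly.

```python
import numpy as np

rng = np.random.default_rng(0)

def coincidence(psi_a, psi_b):
    # HOM coincidence probability for pure single-photon inputs.
    return (1 - abs(np.vdot(psi_a, psi_b)) ** 2) / 2

rho = np.zeros((2, 2), dtype=complex)  # shot-averaged single-photon state
p_coinc = 0.0                          # shot-averaged coincidence rate
shots = 10_000
for _ in range(shots):
    phi = rng.uniform(0, 2 * np.pi)    # one phase, shared by BOTH photons
    psi = np.array([1.0, np.exp(1j * phi)]) / np.sqrt(2)
    rho += np.outer(psi, psi.conj()) / shots
    p_coinc += coincidence(psi, psi) / shots

print(np.trace(rho @ rho).real)  # ~0.5: tomography says maximally mixed
print(p_coinc)                   # ~0.0: interference says perfectly pure
```

Tomography on either photon alone averages over $\phi$ and sees a maximally mixed state, but since both photons carry the identical phase on every shot, the two-photon overlap is always unity and the dip has full visibility.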
\begin{figure}[!t]
  \centerline{
    \mbox{\includegraphics[width=\columnwidth]{Figures/LCsame.eps}}
  }
\caption{Hong-Ou-Mandel interference dip taken while random phases
  were applied to the two LCWPs in figure \ref{brunapparatus}a, but in
  a correlated way so that the phase on either side was the same at
  any given moment.  While the density matrices measured for either
  photon show a maximally mixed state, this high-visibility dip
  would usually be indicative of a completely pure state.}
\label{brunsamephaseshift}
\end{figure}
\section{Discussion}
We have seen that in the `ordinary' case of impurity caused by
an external source of randomness, the Brun approach to direct
purity measurement produces the same results as the old indirect
approach of calculating purity from the density matrix.  The
amazing thing about this technique is that the number of
measurements required to obtain the purity does
not increase as a function of the size of the Hilbert space.  No
matter how big the system being measured, a single joint measurement
on two copies of that system is sufficient to determine the purity.  We
could equally well imagine a box full of million-qubit quantum
computers that had reached the end of some calculation.  We
could obtain their purity by pulling them out two at a time and
performing a joint measurement on their million-qubit state.
This will undoubtedly be much easier than taking the
$4^{1,000,000}$ individual measurements that would be required
to determine the density matrix and calculate the purity from that.

More generally, this approach to finding ways to directly obtain
figures of merit on quantum systems will
become an increasingly important characterization method as quantum
systems become more complex.  The work presented in this chapter
is an important proof-of-principle in this regard.

This work also shows that states with the same benign-looking
single-qubit mixed-state density matrix can exhibit very
different behaviour under certain measurements.  Writing down a
reduced density matrix for a system puts up an artificial border
between that system and the rest
of the universe.  To talk about the purity of that reduced
density matrix is to assume that the randomness that makes it
impure is sufficiently well-separated from the system that no
future measurements will have access to it.  The example
presented in this chapter shows that a common means of creating
impurity is fundamentally incompatible with the most common means of making
two-photon joint measurements and interactions, and that a new
method of creating impurity needs to be employed, one that
safely stores the randomness in a completely different system.
This effect, which is certainly an important one for photon
polarization, may well be important in other
systems where the environment is inefficient at ferrying away
entropy.  
